Nonnative speech recognition based on state-candidate bilingual model modification

Authors

  • Qingqing Zhang
  • Ta Li
  • Jielin Pan
  • Yonghong Yan
Abstract

Speech recognition accuracy is known to degrade for nonnative speakers, especially those who are just beginning to learn a foreign language or who have heavy accents. This paper presents a novel bilingual model modification approach that improves nonnative speech recognition by accounting for the wide variation in accented pronunciation. Each state of the baseline nonnative acoustic models is modified with several candidate states from auxiliary acoustic models trained on the speakers' native language. The state mapping criterion and the n-best candidates are investigated on a grammar-constrained speech recognition system. Compared with nonnative acoustic models that had already been well trained with MAP adaptation, the state-candidate bilingual model modification approach achieved a further relative reduction in Phrase Error Rate (RPhrER) of 7.87%.
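The candidate-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which state mapping criterion the authors use, so the code below assumes a symmetric Kullback-Leibler divergence between states, and it simplifies each state to a single diagonal Gaussian rather than a mixture. The function names, state names, and numbers are hypothetical. How the selected candidates are then merged into the baseline state is not described in the abstract and is left out.

```python
from typing import Dict, List, Tuple

# Assumption: a state is one diagonal Gaussian, stored as (means, variances).
# Real HMM states usually hold Gaussian mixtures; this is a simplification.
State = Tuple[List[float], List[float]]


def symmetric_kl(a: State, b: State) -> float:
    """Symmetric KL divergence between two diagonal Gaussians.

    Assumed mapping criterion; the abstract does not name the one used.
    """
    (mean_a, var_a), (mean_b, var_b) = a, b
    total = 0.0
    for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b):
        diff_sq = (ma - mb) ** 2
        # KL(a||b) + KL(b||a) per dimension; the log-variance terms cancel.
        total += 0.5 * (va / vb + vb / va - 2.0)
        total += 0.5 * diff_sq * (1.0 / va + 1.0 / vb)
    return total


def n_best_candidates(
    baseline: Dict[str, State],
    auxiliary: Dict[str, State],
    n: int,
) -> Dict[str, List[str]]:
    """Map each baseline state to the names of its n closest auxiliary states."""
    mapping: Dict[str, List[str]] = {}
    for name, state in baseline.items():
        ranked = sorted(
            auxiliary,
            key=lambda aux_name: symmetric_kl(state, auxiliary[aux_name]),
        )
        mapping[name] = ranked[:n]
    return mapping


# Hypothetical toy models: one nonnative (baseline) state and three
# native-language (auxiliary) states in a 2-dimensional feature space.
baseline = {"nonnative_s1": ([0.0, 1.0], [1.0, 1.0])}
auxiliary = {
    "native_a": ([0.1, 0.9], [1.0, 1.0]),
    "native_b": ([1.0, 1.0], [1.0, 1.0]),
    "native_c": ([3.0, -2.0], [1.0, 1.0]),
}

print(n_best_candidates(baseline, auxiliary, n=2))
```

In this toy example the two auxiliary states nearest to the baseline state are kept as its candidates. Varying `n` corresponds loosely to the n-best candidate investigation mentioned in the abstract.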

Similar resources

Development of a Mandarin-English Bilingual Speech Recognition System with Unified Acoustic Models

This paper presents our recent work on the development of a grammar-constrained, Mandarin-English bilingual Speech Recognition System (MESRS) for real-world music retrieval. Two of the main difficulties in handling bilingual speech recognition for real-world applications are tackled: one is to balance the performance and the complexity of the bilingual speech recognition system; the othe...

A Two-stage Speaker Adaptation Approach for Subspace Gaussian Mixture Model based Nonnative Speech Recognition

Nonnative speech recognition is becoming more and more important as many speech applications are deployed worldwide. Meanwhile, due to the large population of nonnative speakers, speaker adaptation remains the most practical way of providing high-performance speech services. Subspace Gaussian Mixture Model (SGMM) has recently been shown to yield superior performance on various native speech r...

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

Online Unsupervised Multilingual Acoustic Model Adaptation for Nonnative ASR

Automatic speech recognition (ASR) is currently one of the main research interests in computer science. Hence, many ASR systems are available in the market. Yet, the performance of speech and language recognition systems is poor on nonnative speech. The challenge for nonnative speech recognition is to maximize the accuracy of a speech recognition system when only a small amount of nonnative dat...

Recent Progress in the Decodin with Multilingual Aco

In this paper we report on recent progress in the use of multilingual Hidden Markov Models for the recognition of non-native speech. While we have previously discussed the use of bilingual acoustic models and recognizer combination methods, we now seek to avoid the increased computational load imposed by methods such as ROVER by focusing on acoustic models that share training data from 5 langua...

Publication date: 2008